Re-ranking Summaries Based on Cross-Document Information Extraction
Abstract
This paper describes a novel approach to improving multi-document summarization based on cross-document information extraction (IE). We describe a method to automatically incorporate IE results into sentence ranking. Experiments show that our integration methods significantly improve a high-performing multi-document summarization system according to the ROUGE-2 and ROUGE-SU4 metrics (7.38% relative improvement on ROUGE-2 recall), and that the generated summaries are preferred by human subjects (0.78 higher TAC Content score and 0.11 higher Readability/Fluency score).
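The abstract does not say how IE results enter the sentence ranking, so the following is only a minimal sketch of one plausible scheme: linear interpolation between a baseline summarizer score and a normalized count of IE facts per sentence. The `Sentence` class, the `rerank` function, the `weight` parameter, and the example scores are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    text: str
    base_score: float  # score from the baseline summarizer (assumed to lie in [0, 1])
    ie_facts: int      # hypothetical count of IE facts (entities, relations, events) in the sentence


def rerank(sentences, weight=0.5):
    """Sort sentences by a weighted mix of baseline score and normalized IE fact count.

    This interpolation is an illustrative assumption, not the paper's method.
    """
    # Normalize by the largest fact count; fall back to 1 to avoid dividing by zero.
    max_facts = max((s.ie_facts for s in sentences), default=0) or 1

    def combined(s):
        return (1 - weight) * s.base_score + weight * (s.ie_facts / max_facts)

    return sorted(sentences, key=combined, reverse=True)


# Made-up candidates, purely for illustration.
candidates = [
    Sentence("A", 0.9, 0),
    Sentence("B", 0.7, 3),
    Sentence("C", 0.5, 1),
]
ranked = [s.text for s in rerank(candidates, weight=0.5)]
print(ranked)
```

With `weight=0.0` the order reduces to the baseline ranking, which makes it easy to see how much the IE signal alone changes the selection.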
Similar resources
Multi-document Summarization via Information Extraction: A Revisit
This paper describes a novel approach of improving multi-document summarization based on cross-document information extraction (IE). We first show that IE itself is not sufficient to produce fluent and coherent summaries. Then we attempt various methods to automatically incorporate IE results into sentence ranking. Experiments have shown our integration methods can significantly improve a high-...
AMDS: Sentence Extraction Based Proficient Framework For Multi-Document Summarization
Rapid improvement of electronic documents in World Wide Web has made overload to the users in accessing the information. Therefore, abstracting the primary content from numerous documents related to same topic is highly essential. Summarization of multiple documents helps in valuable decision-making in less time. This paper proposed a framework named Adept Multi-Document Summarization (AMDS) fo...
Using Bilingual Information for Cross-Language Document Summarization
Cross-language document summarization is defined as the task of producing a summary in a target language (e.g. Chinese) for a set of documents in a source language (e.g. English). Existing methods for addressing this task make use of either the information from the original documents in the source language or the information from the translated documents in the target language. In this study, w...
A semantic partition based text mining model for document classification
Feature Extraction is a mechanism used to extract key phrases from any given text documents. This extraction can be weighted, ranked or semantic based. Weighted and Ranking based feature extraction normally assigns scores to extracted words based on various heuristics. Highest scoring words are seen as important. Semantic based extractions normally try to understand word meanings, and words wit...
THUIR at TREC2008: Relevance Feedback Track
Tsinghua University Information Retrieval Group (THUIR) has participated into the first Relevance Feedback Track of TREC2008. The TMiner search engine has been used as our text retrieval system, because the processing capability and flexibility of this system on large text data has been testified during many years’ Web Track and Terabyte Track. In the track, we studied two approaches: 1) query ...